General Bounds on Statistical Query Learning and PAC Learning with Noise via Hypothesis Boosting

Authors

  • Javed A. Aslam
  • Scott E. Decatur
Abstract

We derive general bounds on the complexity of learning in the Statistical Query (SQ) model and in the PAC model with classification noise. We do so by considering the problem of boosting the accuracy of weak learning algorithms which fall within the Statistical Query model. The SQ model was introduced by Kearns [12] to provide a general framework for efficient PAC learning in the presence of classification noise. We first show a general scheme for boosting the accuracy of weak SQ learning algorithms, proving that weak SQ learning is equivalent to strong SQ learning. The boosting is efficient and is used to show our main result: the first general upper bounds on the complexity of strong SQ learning. Specifically, we derive simultaneous upper bounds with respect to ε on the number of queries, O(log²(1/ε)), on the Vapnik-Chervonenkis dimension of the query space, O(log(1/ε) · log log(1/ε)), and on the inverse of the minimum tolerance, O((1/ε) · log(1/ε)). In addition, we show that these general upper bounds are nearly optimal by describing a class of learning problems for which we simultaneously lower bound the number of queries by Ω(log(1/ε)) and the inverse of the minimum tolerance by Ω(1/ε). We further apply our boosting results in the SQ model to learning in the PAC model with classification noise. Since nearly all PAC learning algorithms can be cast in the SQ model, we can apply our boosting techniques to convert these PAC algorithms into highly efficient SQ algorithms. By simulating these efficient SQ algorithms in the PAC model with classification noise, we show that nearly all PAC algorithms can be converted into highly efficient PAC algorithms which tolerate classification noise.

* The first author was supported by DARPA Contract N00014-87-K825 and by NSF Grant CCR-89-14428. Net address: jaa@theory.lcs.mit.edu
† The second author (Aiken Computation Laboratory) was supported by an NDSEG Fellowship and by NSF Grant CCR-92-00884. Net address: sed@das.harvard.edu
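For reference, the simultaneous bounds above can be collected in display form (τ_min denotes the minimum tolerance; this compact notation is ours, not the paper's):

\[
\#\text{queries} = O\!\left(\log^2 \tfrac{1}{\varepsilon}\right),
\qquad
\mathrm{VCdim}(\text{query space}) = O\!\left(\log \tfrac{1}{\varepsilon}\,\log\log \tfrac{1}{\varepsilon}\right),
\qquad
\tau_{\min}^{-1} = O\!\left(\tfrac{1}{\varepsilon}\,\log \tfrac{1}{\varepsilon}\right),
\]

matched, for some class of learning problems, by the lower bounds \(\#\text{queries} = \Omega(\log \tfrac{1}{\varepsilon})\) and \(\tau_{\min}^{-1} = \Omega(\tfrac{1}{\varepsilon})\).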


Related articles

Noise tolerant algorithms for learning and searching

We consider the problem of developing robust algorithms which cope with noisy data. In the Probably Approximately Correct model of machine learning, we develop a general technique which allows nearly all PAC learning algorithms to be converted into highly efficient PAC learning algorithms which tolerate noise. In the field of combinatorial algorithms, we develop techniques for constructing search a...

Full text

Statistical Query Learning (1993; Kearns)

The problem deals with learning {−1, +1}-valued functions from random labeled examples in the presence of random noise in the labels. In the random classification noise model of Angluin and Laird [1], the label of each example given to the learning algorithm is flipped randomly and independently with some fixed probability η called the noise rate. The model is an extension of Valiant's PAC m...

Full text
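The entry above pins down the random classification noise model precisely enough to illustrate the simulation step that links the SQ model to noisy PAC learning. The following minimal Python sketch (ours, not taken from either paper; the names draw_clean, noisy_example, and estimate_sq are illustrative) estimates a statistical query E[χ(x, c(x))] from η-noisy examples, assuming the noise rate η is known and η ≠ 1/2. It relies on the identity E_noisy[χ] = (1 − 2η)·E_clean[χ] + η·E[χ(x, 0) + χ(x, 1)], whose last term is label-independent:

import random

def noisy_example(draw_clean, eta):
    """Draw a clean example (x, c(x)) and flip its label with probability eta."""
    x, label = draw_clean()
    if random.random() < eta:
        label = 1 - label  # random classification noise on {0, 1} labels
    return x, label

def estimate_sq(chi, draw_clean, eta, n_samples):
    """Estimate E[chi(x, c(x))] from eta-noisy examples, for known eta != 1/2."""
    q = 0.0  # empirical mean of chi on noisy examples
    s = 0.0  # empirical mean of chi(x, 0) + chi(x, 1), which ignores the label
    for _ in range(n_samples):
        x, noisy_label = noisy_example(draw_clean, eta)
        q += chi(x, noisy_label)
        s += chi(x, 0) + chi(x, 1)
    q /= n_samples
    s /= n_samples
    # Invert E_noisy[chi] = (1 - 2*eta) * E_clean[chi] + eta * E[chi(x,0) + chi(x,1)].
    return (q - eta * s) / (1 - 2 * eta)

Note the 1/(1 − 2η) factor in the correction: recovering E_clean[χ] to within tolerance τ requires estimating q and s to within roughly τ(1 − 2η), which is where the familiar 1/(1 − 2η)² blowup in sample size enters such simulations.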

An Effective Approach for Robust Metric Learning in the Presence of Label Noise

Many algorithms in machine learning, pattern recognition, and data mining are based on a similarity/distance measure. For example, the kNN classifier and clustering algorithms such as k-means require a similarity/distance function. Also, in Content-Based Information Retrieval (CBIR) systems, we need to rank the retrieved objects based on their similarity to the query. As generic measures such as ...

Full text

Cost Complexity of Proactive Learning via a Reduction to Realizable Active Learning

Proactive Learning is a generalized form of active learning with multiple oracles exhibiting different reliabilities (label noise) and costs. We propose a general approach for Proactive Learning that explicitly addresses the cost vs. reliability tradeoff for oracle and instance selection. We formulate the problem in the PAC learning framework with bounded noise, and transform it into realizable...

Full text

Nearly Tight Bounds on $\ell_1$ Approximation of Self-Bounding Functions

We study the complexity of learning and approximation of self-bounding functions over the uniform distribution on the Boolean hypercube {0,1}^n. Informally, a function f : {0,1}^n → ℝ is self-bounding if for every x ∈ {0,1}^n, f(x) upper bounds the sum of all the n marginal decreases in the value of the function at x. Self-bounding functions include such well-known classes of functions as submo...

Full text
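One common way to make the informal condition in the entry above precise (our notation; the paper may use an equivalent but differently stated definition) is: writing x^{i←b} for x with its i-th coordinate set to b, a function f is self-bounding if for every x ∈ {0,1}^n,

\[
\sum_{i=1}^{n} \Bigl( f(x) - \min_{b \in \{0,1\}} f\bigl(x^{i \leftarrow b}\bigr) \Bigr) \;\le\; f(x).
\]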


Journal:
  • Inf. Comput.

Volume: 141  Issue:

Pages: –

Publication date: 1993